Sequence length 4052 



CCAAGATTTAAAGCCCGCAAOTTTTGTTCTTGAGACCAGCGACTTTAGCTCCGAT^ 
TTTGGACATTTAAAGAGCTGGGCTTGAACTTCGTGAGTTTCG^ 

TGGCATGGAATATTCACATGGGAGAGCCGCATGAGGCCGCCCACCACGCTTCCTGAAGGATGCCCGTGTGGAAGAATTT 
TGACGTGCCAGTGTCCTCGTTCTACAGGGTGTTCCATTCTTCCGCAATCTCAGAAAAATGGGACTAAAAGAAACTATTT 
TGTAAAATAAGAAGACTTCCATTTTTAATGACCAACATGTATTAAGATGGACACCTACTCTACGAAACACGAAGTTCTA 

M R M L 4 

TGGTCTCGAAGAAGCCCGTGCCTGTTTAAAACTGATCCTAACTAAAAACAGACTTGAGTGGAT ATG AGA ATG TTG 12 

VSGRRVKKWQLIIQLFATCF 24

GTT AGT GGC AGA AGA GTC AAA AAA TGG CAG TTA ATT ATT CAG TTA TTT GCT ACT TGT TTT 72

LASLMFFWEPIDNHIVSHMK 44 

TTA GCG AGC CTC ATG TTT TTT TGG GAA CCA ATC GAT AAT CAC ATT GTG AGC CAT ATG AAG 132 

SYSYRYLINSYDFVNDTLSL 64

TCA TAT TCT TAC AGA TAC CTC ATA AAT AGC TAT GAC TTT GTG AAT GAT ACC CTG TCT CTT 192

KHTSAGPRYQYLINHKEKCQ 84

AAG CAC ACC TCA GCG GGG CCT CGC TAC CAA TAC TTG ATT AAC CAC AAG GAA AAG TGT CAA 252

AQDVLLLLFVKTAPENYDRR 104

GCT CAA GAC GTC CTC CTT TTA CTG TTT GTA AAA ACT GCT CCT GAA AAC TAT GAT CGA CGT 312

SGIRRTWGNENYVRSQLNAN 124

ifcC GGA ATT AGA AGG ACG TGG GGC AAT GAA AAT TAT GTT CGG TCT CAG CTG AAT GCC AAC 372 

IKTLFALGTPNPLEGEELQR 144

AAA ACT CTG TTT GCC TTA GGA ACT CCT AAT CCA CTG GAG GGA GAA GAA CTA CAA AGA 432 

KLAWEDQRYNDIIQQDFVDS 164

AAA CTG GCT TGG GAA GAT CAA AGG TAC AAT GAT ATA ATT CAG CAA GAC TTT GTT GAT TCT 492 

FYNLTLKLLMQFSWANTYCP 184 

TTC TAC AAT CTT ACT CTG AAA TTA CTT ATG CAG TTC AGT TGG GCA AAT ACC TAT TGT CCA 552 

HAKFLMTADDDIFIHMPNLI 204

CAT GCC AAA TTT CTT ATG ACT GCT GAT GAT GAC ATA TTT ATT CAC ATG CCA AAT CTG ATT 612 

EYLQSLEQIGVQDFWIGRVH 224 

GAG TAC CTT CAA AGT TTA GAA CAA ATT GGT GTT CAA GAC TTT TGG ATT GGT CGT GTT CAT 672 

RGAPPIRDKSSKYYVSYEMY 244 

CGT GGT GCC CCT CCC ATT AGA GAT AAA AGC AGC AAA TAC TAC GTG TCC TAT GAA ATG TAC 732 

QWPAYPDYTAGAAYVISGDV 264 

CAG TGG CCA GCT TAC CCT GAC TAC ACA GCC GGA GCT GCC TAT GTA ATC TCC GGT GAT GTA 792 

AAKVYEASQTLNSSLYIDDV 284

GCT GCC AAA GTC TAT GAG GCA TCA CAG ACA CTA AAT TCA AGT CTT TAC ATA GAC GAT GTG 852 



FMGLCANKIGIVPQDHVFFS 304

TTC ATG GGC CTC TGT GCC AAT AAA ATA GGG ATA GTA CCG CAG GAC CAT GTG TTT TTT TCT 912

GEGKTPYHPCIYEKMMTSHG 324

GGA GAG GGT AAA ACT CCT TAT CAT CCC TGC ATC TAT GAA AAA ATG ATG ACA TCT CAT GGA 972

HLEDLQDLWKNATDPKVKTI 344

CAC TTA GAA GAT CTC CAG GAC CTT TGG AAG AAT GCT ACA GAT CCT AAA GTA AAA ACC ATT 1032

SKGFFGQIYCRLMKIILLCK 364

TCC AAA GGT TTT TTT GGT CAA ATA TAC TGC AGA TTA ATG AAG ATA ATT CTC CTT TGT AAA 1092

ISYVDTYPCRAAFI* 379

ATT AGC TAT GTG GAC ACA TAC CCT TGT AGG GCT GCG TTT ATC TAA 1137
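The pairing of amino-acid lines with codon lines above can be spot-checked mechanically. The following is a minimal sketch, assuming the standard genetic code; the helper name `translate` and the choice to report unreadable codons as `X` are illustrative assumptions, not part of the listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(codons: str) -> str:
    """Translate whitespace-separated codons into one-letter amino acids.

    Codons that are not three valid bases (for example scanning damage)
    are reported as 'X' rather than guessed.
    """
    return "".join(CODON_TABLE.get(c.upper(), "X") for c in codons.split())


# Codon rows copied from the listing: the start of the reading frame,
# the row ending at nucleotide 132, and the final row with the stop codon.
start = translate("ATG AGA ATG TTG")
middle = translate(
    "TTA GCG AGC CTC ATG TTT TTT TGG GAA CCA ATC GAT "
    "AAT CAC ATT GTG AGC CAT ATG AAG"
)
end = translate("GCG TTT ATC TAA")
print(start, middle, end)
```

A row whose translation disagrees with the amino-acid line printed above it points to a misread base in that row.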



TAGTACTTGAATGTTGTATGTTTTCACTGTCACTGAGTCAAACCTGGATGAAAAAAACCTTTAAATGTTCGTCTATACC 
CTAAGTAAAATGAGGACGAAAGACAAATATTTTGAAAGCCTAGTCCATCAGAATGTTTC^^ 

AATATCACTTATCTACTTCATTGCCTAAGTTCATTTCAAAGAATTTGTATTTAGAAAAGGTTTATATTATTAGTGAAAA 
lAAi^CTAAAGGGAAGTTCAAGTTCTCATGTAATGCCACATATATACTTGAGGTGTAGAGATGTTATTAAGAAGTTTTG 
fix^TTAGAATAATTGCTTTTGGAAAATACCAAATGAACGTACAGTACAACATTTCAAGGAAATGAATATATTGTTAGAC 

IgAGGTAAGCAAGTTTATTTTTGTTAAAGAGCACTTGGTGGAGGTAGTAGGG 

.^^TCATGAATCTGGTAAAACAGTCTCTTGTTCTT 


.^^AATCACGTCTTACCACATCCATGTAGCTACTGGTGTTAGAGTCATTAAAATACCTTTTTTTGCATCTTT 
J^TAATGTGAACTTTTAGAAAAGTGATTAATGTT^ 
I^^IAAAATGACACATAACACGGGCAGCTGGTTGCTCATAGG^ 
^,,gK3ATTTTAATGACGTTTTCAACTGGTTTTT^ 

ACATAGAGGAATATAATAATGGAGAGACTTCAAATGGAAAGACAGAACATTACAAGCCTAATGTCTC 

AAATGAAATCTTAGTGTCTAAATCCTTGTACTGATTACTAAAATTAACCCT^CTCCTCC^ 

CAGCACTTTGTTCCAAGTTCAGAGTTTTAAATTGAGAGCATTAAACATCAAAGOT 

TCATCAATAACTGTCAGAGGTGATCTTTATTTTCTAAATATTTCAAA^ 

TTGCCAGTTTGGGGTTAAAGCATTTTTAAAGCTGCATGTTCCTT^ 

AAGTTACATTTATTTTACAAAGCAGGATAAAAATGTGGCTATAATACACACTACCTCCCTTC^ 

GTGGTGTCTACTGCTAGGGAGATTATATGAAGGCCAAAATAATGACTTCAGCAAGAGTGACTGAACT^ 

TTTGACTGCAGAGGCACCTGTTAGGGAAAATCAGATGTCTCATATAATAAGGTGA 

GAAAAAAGATTTCTCAGTATACACAACTGAATGATGATACTTACAATTTTTAGCAGG^ 

ATTTTAATTTTTTTCTATTTTGAAATTTGAGGCTTGTTTACATTG^ 







ACTACAGTGTCAAACATTCTAGGTTGTAGTTACTTTCAGAGTAGATACAGGGTTT^ 

TGACCAATTAAAAAAACATAGAGAACAAAAGCATATTTGACCAAGCAACAAGCTTATAATTi^ 

GATTAATGATGTATTGCCTTTTGCCCATATATACCCTGTGTATCTATACTTGGAAGTO 

AAACATAAGTGTCTCTGGCCATCAAAGTGATCTTGTTTACAGCAGTGCTTTTGTGAAACAATTATTTATTTGCTGAAAG 
AGCTCTTCTGAACTGTGTCCTTTTAATTTTTGCTTAGAATAGAATGGAACAAGTTTAAATT 

ACTTCCTTTTTTTCTAAGAAGGAAGTTGCTAGATGATTCCTTCATCACACTTACTTAAAGTACTGAGAAGAGTATCTGT 
AAATAAAAGGGTTCCAACCTTTTAAAAAAGAAGGAAAAAACTTTTTGGTGCTCCAGTGTAGGGCTATCTT^ 
TGTCAACAAAGGGAAAATAAACTATCAGCTTGGATGGTCACTTGAATAGAAGATGGTTATACACAGTGTTATTGTTAAA 
ATTTTTTTACCTTTTGGTTGGTTTGCATCTTTTTTCC 

TTGAATTTTGCTCTTGTATGGCAAAATAATTAGTGAGTTTAAAAAAAATCTATAGTTTCCAATAAACAACTC 
@AAAAAA 




Analysis of 8797 (378 aa) 
>8797
MRMLVSGRRVKKWQLIIQLFATCFLASLMFFWEPIDNHIVSHMKSYSYRYLINSYDFVND
TLSLKHTSAGPRYQYLINHKEKCQAQDVLLLLFVKTAPENYDRRSGIRRTWGNENYVRSQ
LNANIKTLFALGTPNPLEGEELQRKLAWEDQRYNDIIQQDFVDSFYNLTLKLLMQFSWAN
TYCPHAKFLMTADDDIFIHMPNLIEYLQSLEQIGVQDFWIGRVHRGAPPIRDKSSKYYVS
YEMYQWPAYPDYTAGAAYVISGDVAAKVYEASQTLNSSLYIDDVFMGLCANKIGIVPQDH
VFFSGEGKTPYHPCIYEKMMTSHGHLEDLQDLWKNATDPKVKTISKGFFGQIYCRLMKII
LLCKISYVDTYPCRAAFI
Protein Family / Domain Matches, HMMer version 2

Searching for complete domains in PFAM 
hmmpfam - search a single seq against HMM database
HMMER 2.1.1 (Dec 1998)

Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL) 



HMM file:                 /prod/ddm/seqanal/PFAM/pfam5.4/Pfam
Sequence file:            /prod/ddm/wspace/orfanal/oa-script.19955.seq



Query: 8797 

Scores for sequence family classification (score includes all domains):
Model        Description                Score    E-value  N
Galactosyl_T Galactosyl transferase     173.8    2.8e-48   1

Parsed for domains:
Model        Domain  seq-f seq-t    hmm-f hmm-t      score  E-value
Galactosyl_T   1/1     102   321        1   249 []   173.8  2.8e-48



Alignments of top-scoring domains: 

Galactosyl_T: domain 1 of 1, from 102 to 321: score 173.8, E = 2.8e-48

*->arRnaiRkTVftnnqnnsegvadgrikalFlvGl , sakgdqklkklvme 
+rR iR+TW+n+n++++ ++ ik+lF +G++++++-1-++I++ + + 
8797 102 DRRSGIRRTWGNENYVRSQLNANIKTLFALGTpNPLEGEELQRKLAW 148

EakrtlyGDiiwDleDsYenlitlKTltillygvskcpsakligKiDdDv 
E++ y Dii + +D+ Ds++nIjtlK 1+ 4-++++++cp+ak+ + DdD+ 
8797 149 EDQ--RYNDIIQQDFVDSFYNLTLKLLMQFSWANTYCPHAKFLMTADDDI 196

fvnpdkLlslLereniridpsessfyGyiikegepvrrkkskrdWYvppt 
f+ +++L+++L+ i ++++++ G+++++ +p+r k sk Yv+++ 

8797 197 FIHMPNLIEYLQSL-EQIGVQDFWI-GRVHRGAPPIRDKSSK--YYVSYE 242

eYpcsrNgnkYPpYvsGpfYllsrdAAplXlkaskhrLr , f IkiEDVliT 
Y + YP Y +G Y++s+d+A ++++as + ++ 1 i+DV++ 

8797 243 MYQWPA----YPDYTAGAAYVISGDVAAKVYEASQTL-NsSLYIDDVFM- 286

Gi laedlgl sr inlpr Is i s tnl f r f hhsQkdndgcdvf awht ahkndpe 
G -f-a+++gl +++ +f++ +++ h++ +e 

8797 287 GLCANKIGIVPQDH VFFSGEGKTPY HPCIYE 317 



ylif<-*
++ +
8797 318 KMMT 321
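The per-domain header that opens the alignment above carries the same coordinates and scores as the summary table, so it is a convenient line to read programmatically. The following is a minimal sketch, assuming the header keeps the "model: domain i of n, from a to b: score s, E = e" layout shown in this report; the function name `parse_domain_header` and the returned field names are hypothetical and are not part of HMMER.

```python
import re

# Pattern for the per-domain header line as it appears in this report.
# The layout is taken from the report itself; other HMMER versions may differ.
HEADER_RE = re.compile(
    r"^(?P<model>\S+): domain (?P<index>\d+) of (?P<total>\d+), "
    r"from (?P<start>\d+) to (?P<end>\d+): "
    r"score (?P<score>-?\d+(?:\.\d+)?), E = (?P<evalue>\S+)$"
)


def parse_domain_header(line: str) -> dict:
    """Return the fields of one domain header line as typed values."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a domain header line: {line!r}")
    return {
        "model": match["model"],
        "index": int(match["index"]),
        "total": int(match["total"]),
        "start": int(match["start"]),
        "end": int(match["end"]),
        "score": float(match["score"]),
        "evalue": float(match["evalue"]),
    }


hit = parse_domain_header(
    "Galactosyl_T: domain 1 of 1, from 102 to 321: score 173.8, E = 2.8e-48"
)
# Number of query residues covered by the domain (coordinates are inclusive).
span = hit["end"] - hit["start"] + 1
print(hit["model"], span, hit["score"], hit["evalue"])
```

For query 8797 the inclusive span works out to 220 of the 378 residues.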



Transmembrane Segments Predicted by MEMSAT 
>8797
MRMLVSGRRVKKWQLIIQLFATCFLASLMFFWEPIDNHIVSHMKSYSYRYLINSYDFVND
TLSLKHTSAGPRYQYLINHKEKCQAQDVLLLLFVKTAPENYDRRSGIRRTWGNENYVRSQ
LNANIKTLFALGTPNPLEGEELQRKLAWEDQRYNDIIQQDFVDSFYNLTLKLLMQFSWAN
TYCPHAKFLMTADDDIFIHMPNLIEYLQSLEQIGVQDFWIGRVHRGAPPIRDKSSKYYVS
YEMYQWPAYPDYTAGAAYVISGDVAAKVYEASQTLNSSLYIDDVFMGLCANKIGIVPQDH
VFFSGEGKTPYHPCIYEKMMTSHGHLEDLQDLWKNATDPKVKTISKGFFGQIYCRLMKII
LLCKISYVDTYPCRAAFI